This graph shows how many women and men work in each field of IT. Men clearly outnumber women in IT, and most women work in back-end development. The majority of IT specialists work as full-stack, front-end, or back-end developers. To plot this graph, we selected the developer type and gender columns and counted how many respondents work in each field.
This graph shows how often people at different stages of their IT career use Stack Overflow. Students, who are young and have little experience, use Stack Overflow most often and usually have a Stack Overflow account. More experienced respondents depend on Stack Overflow less than their younger coworkers and often do not have an account. To plot the graph, we selected the columns for frequency of visits, having an account, and years of professional coding. We used the ggplot library with the geom_jitter() method.
This graph shows the most widely used operating system in each country. The categories are Linux, Windows, MacOS, and Linux-based systems. The most popular operating system in Europe and America is Windows. In a few African countries, IT specialists and students mainly use MacOS (Mali, Niger, Namibia, Botswana) or Linux (Mozambique, Kenya, Senegal, Guinea). In Asia, MacOS dominates in Japan and Thailand. To draw this graph, we used the geopandas and matplotlib libraries.
This graph presents the mean satisfaction and usage of programming languages, grouped by paradigm. The accompanying table of languages is also ordered by paradigm. The graph on the left shows the mean satisfaction score of each paradigm: each segment represents a paradigm, and its distance from the center indicates the average satisfaction level. The graph on the right shows the number of programming languages within each paradigm. Together, the two graphs show how programming language paradigms relate to job satisfaction and how languages are distributed across paradigms.
In a separate Python file, before generating the graphs, we pre-processed several datasets. For the 2018 developer survey, these were files covering gender and developer types, salary and years of coding, and operating systems by country.
Initially, each respondent has many values in a single column called developer type. We split these types apart and created a new matrix of gender and developer type, with one column per type holding ‘0’ or ‘1’.
| X | Gender | System.administrator | Student | Game.or.graphics.developer | Designer | Embedded.applications.or.devices.developer | Marketing.or.sales.professional | Back.end.developer | Data.scientist.or.machine.learning.specialist | Data.or.business.analyst | Full.stack.developer | DevOps.specialist | C.suite.executive..CEO..CTO..etc.. | Engineering.manager | Mobile.developer | Product.manager | Database.administrator | QA.or.test.developer | Front.end.developer | Educator.or.academic.researcher | Desktop.or.enterprise.applications.developer |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Male | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | Male | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 3 | Male | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | Male | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| 5 | Male | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
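
The split described above can be sketched roughly as follows. This is a minimal illustration, not the code we ran: the column name `DevType`, the `;` separator, and the sample rows are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows; the column names and the ";" separator are assumptions.
raw = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "DevType": [
        "Full-stack developer",
        "Back-end developer;Database administrator",
        "Student;Back-end developer",
    ],
})


def expand_dev_types(df: pd.DataFrame, column: str = "DevType", sep: str = ";") -> pd.DataFrame:
    """Replace a multi-valued column with one 0/1 indicator column per value."""
    indicators = df[column].str.get_dummies(sep=sep)
    return pd.concat([df.drop(columns=[column]), indicators], axis=1)


dev_matrix = expand_dev_types(raw)

# Summing the indicator columns per gender gives the counts used for the plot.
counts_by_gender = dev_matrix.groupby("Gender").sum()
```

Because each developer type becomes its own 0/1 column, counting respondents per field and gender reduces to a grouped sum.
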
Here we also created a new dataset that records, for each respondent, whether they have an account, how often they use Stack Overflow, and their experience in years.
We created a new dataset which consists of the name of each country and the operating system mainly used there. Some country names differ from those used by the geopandas library, so we modified them to match.
| X | Country | OperatingSystem |
|---|---|---|
| 0 | Kenya | Linux-based |
| 1 | United Kingdom | Linux-based |
| 3 | United States of America | Windows |
| 4 | South Africa | Windows |
| 5 | United Kingdom | Linux-based |
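
As a rough sketch of this step, the country names can be aligned with a rename map and the most common operating system picked per country. The rename map, the survey spelling `United States`, and the sample rows below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample rows; one row per respondent is an assumption.
responses = pd.DataFrame({
    "Country": ["United States", "United States", "Kenya",
                "United States", "Kenya", "Kenya"],
    "OperatingSystem": ["Windows", "MacOS", "Linux-based",
                        "Windows", "Linux-based", "Windows"],
})

# Illustrative map from survey spellings to the spellings of the map dataset.
COUNTRY_RENAMES = {
    "United States": "United States of America",
}


def dominant_os_by_country(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per country with its most frequently reported system."""
    renamed = df.assign(Country=df["Country"].replace(COUNTRY_RENAMES))
    # Series.mode() returns sorted values, so ties are broken alphabetically.
    return (
        renamed.groupby("Country")["OperatingSystem"]
        .agg(lambda s: s.mode().iloc[0])
        .reset_index()
    )


dominant = dominant_os_by_country(responses)
```

The resulting frame has one row per country, which is the shape needed to merge it with a geopandas world map on the country name.
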
We created a function which takes a DataFrame containing information about programming languages worked with and job satisfaction, expands the DataFrame to include one column for each programming language, and fills these columns based on the languages worked with by each respondent.
| X | JobSatisfaction | TypeScript | Objective.C | Cobol | Julia | Rust | Groovy | F. | VBA | C. | Perl | Matlab | VB.NET | C.. | Kotlin | JavaScript | Lua | Scala | Hack | C | Assembly | Visual.Basic.6 | PHP | Swift | CoffeeScript | Bash.Shell | Ocaml | Java | Go | CSS | SQL | R | Python | Haskell | Erlang | HTML | Delphi.Object.Pascal | Ruby | Clojure |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Extremely satisfied | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 1 | Moderately dissatisfied | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | Neither satisfied nor dissatisfied | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 4 | Slightly satisfied | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | Moderately satisfied | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
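
A minimal sketch of such a function, together with one possible way to turn the expanded table into a mean satisfaction score per language, is shown below. The column name `LanguageWorkedWith`, the `;` separator, and the 1 to 7 numeric encoding of the satisfaction labels are assumptions, not confirmed details of our pipeline.

```python
import pandas as pd

# Assumed numeric encoding of the satisfaction labels (1 = worst, 7 = best).
SATISFACTION_SCORE = {
    "Extremely dissatisfied": 1,
    "Moderately dissatisfied": 2,
    "Slightly dissatisfied": 3,
    "Neither satisfied nor dissatisfied": 4,
    "Slightly satisfied": 5,
    "Moderately satisfied": 6,
    "Extremely satisfied": 7,
}


def expand_languages(df: pd.DataFrame, column: str = "LanguageWorkedWith",
                     sep: str = ";") -> pd.DataFrame:
    """Keep JobSatisfaction and add one 0/1 column per programming language."""
    indicators = df[column].str.get_dummies(sep=sep)
    return pd.concat([df[["JobSatisfaction"]], indicators], axis=1)


def mean_satisfaction_per_language(expanded: pd.DataFrame) -> pd.Series:
    """Average the satisfaction score over the respondents who use each language."""
    scores = expanded["JobSatisfaction"].map(SATISFACTION_SCORE)
    languages = expanded.drop(columns=["JobSatisfaction"])
    # Sum of users' scores divided by the number of users, per language.
    return languages.mul(scores, axis=0).sum() / languages.sum()


# Hypothetical sample rows.
survey = pd.DataFrame({
    "JobSatisfaction": ["Extremely satisfied", "Moderately dissatisfied", "Slightly satisfied"],
    "LanguageWorkedWith": ["JavaScript;Python", "JavaScript;Bash/Shell", "Python;SQL"],
})
expanded = expand_languages(survey)
means = mean_satisfaction_per_language(expanded)
```

Averaging these per-language scores within each paradigm would then give the values plotted for the paradigm segments.
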
This graph shows the distribution of grouped developer types across companies of different sizes. We count how many data scientists, developers, and managers work at companies of each size. For every company size, developers are the most common group and make up the majority of employees. Like the fourth graph, this one is interactive.
This graph shows the relationship between years of professional and non-professional coding, together with the salary of each group. Most respondents in the dataset have worked in IT for 2 to 15 years and earn an average salary. People who have worked for about 20-30 years earn much more, about 100,000 USD per year. The graph shows a sample of 100 respondents, which keeps it readable. It also shows that people typically start coding about 5 years before their career begins, probably because of their studies or courses. The dataset is from 2023.
Similarly, we processed data from the 2023 developer survey, encompassing files detailing time and search responses, knowledge distribution and frequency, age juxtaposed with years of experience, developer types in relation to organizational size, and salary aligned with years of coding.
We created a function which groups developer types into broader categories, sorts organization sizes, and saves the resulting DataFrame to a CSV file for further analysis.
| X | DevType | OrgSize | Grouped_DevType |
|---|---|---|---|
| 50737 | Developer, full-stack | Just me - I am a freelancer, sole proprietor, etc. | Developer |
| 22682 | Developer, back-end | Just me - I am a freelancer, sole proprietor, etc. | Developer |
| 79944 | Developer, desktop or enterprise applications | Just me - I am a freelancer, sole proprietor, etc. | Developer |
| 79941 | Developer, front-end | Just me - I am a freelancer, sole proprietor, etc. | Developer |
| 47581 | Developer, full-stack | Just me - I am a freelancer, sole proprietor, etc. | Developer |
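
The grouping and sorting could look roughly like the sketch below. Only the `Developer` group and the first organization size label appear in the table above; the other group names, the keyword rules, the remaining size labels, and the optional `csv_path` parameter are hypothetical.

```python
import pandas as pd

# Assumed ordering of organization sizes, smallest first (shortened list).
ORG_SIZE_ORDER = [
    "Just me - I am a freelancer, sole proprietor, etc.",
    "2 to 9 employees",
    "20 to 99 employees",
    "100 to 499 employees",
]


def group_dev_type(dev_type: str) -> str:
    """Map a detailed developer type to a broader category (illustrative rules)."""
    lowered = dev_type.lower()
    if lowered.startswith("developer"):
        return "Developer"
    if "data scientist" in lowered or "analyst" in lowered:
        return "Data"
    if "manager" in lowered or "executive" in lowered:
        return "Management"
    return "Other"


def group_and_sort(df: pd.DataFrame, csv_path=None) -> pd.DataFrame:
    """Add Grouped_DevType, order rows by organization size, optionally save to CSV."""
    out = df.dropna(subset=["DevType", "OrgSize"]).copy()
    out["Grouped_DevType"] = out["DevType"].map(group_dev_type)
    # An ordered categorical sorts by company size instead of alphabetically.
    out["OrgSize"] = pd.Categorical(out["OrgSize"], categories=ORG_SIZE_ORDER, ordered=True)
    out = out.sort_values("OrgSize", kind="stable")
    if csv_path is not None:
        out.to_csv(csv_path, index=False)
    return out


# Hypothetical sample rows.
survey = pd.DataFrame({
    "DevType": ["Developer, back-end", "Data scientist or machine learning specialist",
                "Engineering manager", "Developer, full-stack"],
    "OrgSize": ["20 to 99 employees", "2 to 9 employees",
                "Just me - I am a freelancer, sole proprietor, etc.", "2 to 9 employees"],
})
grouped = group_and_sort(survey)
```

Sorting on an ordered categorical keeps the company sizes in their natural order on the plot axis, which a plain string sort would not.
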
We created a new dataset with the number of years each respondent has coded professionally and non-professionally. The code selects the columns related to salary and coding experience, drops rows with missing values, and renames the columns for clarity. We also standardized the salary to USD.
| X | Salary | SalaryUSD | YearsCode | YearsCodePro | SalaryUSD.1 |
|---|---|---|---|---|---|
| 1 | 285000 | 285000 | 18 | 9 | 285000 |
| 2 | 250000 | 250000 | 27 | 23 | 250000 |
| 3 | 156000 | 156000 | 12 | 7 | 156000 |
| 4 | 1320000 | 23456 | 6 | 4 | 23456 |
| 5 | 78000 | 96828 | 21 | 21 | 96828 |
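
A minimal sketch of this selection step is given below. The raw column names `CompTotal` and `ConvertedCompYearly` and the use of a pre-converted yearly column for the USD values are assumptions; dropping non-numeric experience answers is one possible choice, not necessarily the one we made.

```python
import pandas as pd

# Assumed raw column names and their clearer replacements.
RENAMES = {"CompTotal": "Salary", "ConvertedCompYearly": "SalaryUSD"}
COLUMNS = ["CompTotal", "ConvertedCompYearly", "YearsCode", "YearsCodePro"]


def prepare_salary_data(df: pd.DataFrame) -> pd.DataFrame:
    """Select salary and experience columns, drop incomplete rows, rename columns."""
    out = df[COLUMNS].dropna().rename(columns=RENAMES)
    for col in ("YearsCode", "YearsCodePro"):
        # Text answers such as "Less than 1 year" become NaN and are dropped below.
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out.dropna().reset_index(drop=True)


# Hypothetical sample rows.
raw = pd.DataFrame({
    "CompTotal": [285000, 1320000, None, 50000],
    "ConvertedCompYearly": [285000, 23456, 40000, 53000],
    "YearsCode": ["18", "6", "10", "Less than 1 year"],
    "YearsCodePro": ["9", "4", "5", "Less than 1 year"],
    "Country": ["A", "B", "C", "D"],
})
clean = prepare_salary_data(raw)
```

A readable subset for plotting can then be drawn with `clean.sample(n=100, random_state=0)` once the cleaned frame has at least 100 rows.
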